 linear convergence rate




Sub-sampled Newton Methods with Non-uniform Sampling

Neural Information Processing Systems

We consider the regime where $n \gg d$. We propose randomized Newton-type algorithms that exploit non-uniform sub-sampling of $\{\nabla^2 f_i(w)\}_{i=1}^{n}$, as well as inexact updates, as means to reduce the computational complexity, and are applicable to a wide range of problems in machine learning. Two non-uniform sampling distributions based on block norm squares and block partial leverage scores are considered. Under certain assumptions, we show that our algorithms inherit a linear-quadratic convergence rate in $w$ and achieve a lower computational complexity compared to similar existing methods. In addition, we show that our algorithms exhibit more robustness and better dependence on problem specific quantities, such as the condition number. We empirically demonstrate that our methods are at least twice as fast as Newton's methods on several real datasets.
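To make the idea of a non-uniformly sub-sampled Hessian concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a ridge-regularized least-squares objective (chosen only because each per-sample Hessian is the rank-one matrix a_i a_i^T), it takes "block norm squares" to mean sampling row i with probability proportional to ||a_i||^2, and it omits leverage-score sampling and inexact solves entirely. The function name `subsampled_newton_ridge` and all parameters are hypothetical.

```python
import numpy as np

def subsampled_newton_ridge(A, b, lam, sample_size, iters, seed=0):
    """Illustrative sub-sampled Newton for 0.5*||Aw - b||^2 + 0.5*lam*||w||^2.

    Assumption: rows are sampled with probability proportional to their
    squared norm; the gradient is computed exactly on all n rows.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    row_norms_sq = np.einsum("ij,ij->i", A, A)   # ||a_i||^2 for every row
    probs = row_norms_sq / row_norms_sq.sum()    # non-uniform sampling distribution
    w = np.zeros(d)
    for _ in range(iters):
        grad = A.T @ (A @ w - b) + lam * w       # exact gradient
        idx = rng.choice(n, size=sample_size, replace=True, p=probs)
        # Rescale each sampled term by 1 / (s * p_i) so the sketch is unbiased.
        scale = 1.0 / (sample_size * probs[idx])
        A_s = A[idx]
        H_approx = (A_s * scale[:, None]).T @ A_s + lam * np.eye(d)
        w = w - np.linalg.solve(H_approx, grad)  # Newton-type step
    return w
```

Because the Hessian of this objective is constant, the sketch can only illustrate the linear part of the rate: each step shrinks the error by a factor governed by how well the sampled matrix approximates the full Hessian. Building the approximate Hessian costs on the order of `sample_size * d**2` operations rather than `n * d**2`, which is where a saving would come from when n is much larger than d.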


Asynchronous Parallel Greedy Coordinate Descent

Neural Information Processing Systems

In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints. At each iteration, workers asynchronously conduct greedy coordinate descent updates on a block of variables. In the first part of the paper, we analyze the theoretical behavior of Asy-GCD and prove a linear convergence rate. In the second part, we develop an efficient kernel SVM solver based on Asy-GCD in the shared memory multi-core setting. Since our algorithm is fully asynchronous--each core does not need to idle and wait for the other cores--the resulting algorithm enjoys good speedup and outperforms existing multi-core kernel SVM solvers including asynchronous stochastic coordinate descent and multi-core LIBSVM.
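The greedy coordinate update at the core of this abstract can be sketched in a few lines of Python. This is a sequential, single-threaded illustration only: it does not model asynchrony, blocks of variables per worker, or the kernel SVM solver. It assumes a box-constrained quadratic objective 0.5 x^T Q x - c^T x with symmetric Q and positive diagonal, and it uses one common greedy rule (pick the coordinate whose exact one-dimensional update moves the furthest), which may differ from the rule analyzed in the paper. The name `greedy_cd_box_qp` is hypothetical.

```python
import numpy as np

def greedy_cd_box_qp(Q, c, lower, upper, max_iters=1000, tol=1e-10):
    """Illustrative greedy coordinate descent for
    min 0.5*x^T Q x - c^T x  subject to  lower <= x <= upper.

    Assumption: Q is symmetric with a strictly positive diagonal.
    """
    n = len(c)
    x = np.clip(np.zeros(n), lower, upper)   # feasible starting point
    grad = Q @ x - c                         # gradient, maintained incrementally
    diag = np.diag(Q)
    for _ in range(max_iters):
        # Exact minimizer along each coordinate, projected onto the box.
        candidate = np.clip(x - grad / diag, lower, upper)
        delta = candidate - x
        i = int(np.argmax(np.abs(delta)))    # greedy choice: largest move
        if abs(delta[i]) < tol:
            break                            # no coordinate can still improve
        x[i] += delta[i]
        grad += Q[:, i] * delta[i]           # rank-one gradient update
    return x
```

For example, with `Q = [[2, 1], [1, 2]]`, `c = [4, -1]` and bounds `[0, 1]`, the first update moves the first coordinate to its upper bound and the iteration then stops at `[1, 0]`, where both bound constraints are active. In an asynchronous setting each worker would run updates like this on its own block while reading a possibly stale `x`, which this sketch does not attempt to reproduce.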







16bda725ae44af3bb9316f416bd13b1b-Paper.pdf

Neural Information Processing Systems

However, since their proof relies on the existence of a convergent subsequence, it does not reveal any rate for global convergence.